SHEF-Multimodal: Grounding Machine Translation on Images
نویسندگان
چکیده
This paper describes the University of Sheffield’s submission for the WMT16 Multimodal Machine Translation shared task, where we participated in Task 1 to develop German-to-English and Englishto-German statistical machine translation (SMT) systems in the domain of image descriptions. Our proposed systems are standard phrase-based SMT systems based on the Moses decoder, trained only on the provided data. We investigate how image features can be used to re-rank the n-best list produced by the SMT model, with the aim of improving performance by grounding the translations on images. Our submissions are able to outperform the strong, text-only baseline system for both directions.
منابع مشابه
Emergent Translation in Multi-Agent Communication
While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans. In this work, we propose a communication game where two agents, native speakers of their own respective languages, jointly learn to solve a visual referential task. We find that the ability to under...
متن کاملMergent T Ranslation in M Ulti - a Gent C Ommunication
While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans. In this work, we propose a communication game where two agents, native speakers of their own respective languages, jointly learn to solve a visual referential task. We find that the ability to under...
متن کاملSheffield MultiMT: Using Object Posterior Predictions for Multimodal Machine Translation
This paper describes the University of Sheffield’s submission to the WMT17 Multimodal Machine Translation shared task. We participated in Task 1 to develop an MT system to translate an image description from English to German and French, given its corresponding image. Our proposed systems are based on the state-of-the-art Neural Machine Translation approach. We investigate the effect of replaci...
متن کاملMultimodal Pivots for Image Caption Translation
We present an approach to improve statistical machine translation of image descriptions by multimodal pivots defined in visual space. Image similarity is computed by a convolutional neural network and incorporated into a target-side translation memory retrieval model where descriptions of most similar images are used to rerank translation outputs. Our approach does not depend on the availabilit...
متن کاملAn empirical study on the effectiveness of images in Multimodal Neural Machine Translation
In state-of-the-art Neural Machine Translation (NMT), an attention mechanism is used during decoding to enhance the translation. At every step, the decoder uses this mechanism to focus on different parts of the source sentence to gather the most useful information before outputting its target word. Recently, the effectiveness of the attention mechanism has also been explored for multimodal task...
متن کامل